Lab 2 - Indicators

Author

Nissim Lebovits

Published

September 15, 2023

Intro

Prepare a policy brief for the local City Council representatives. Do households value transit-rich neighborhoods compared to others? How certain can you be about your conclusions given some of the spatial biases we’ve discussed? You must choose a city with open transit station data. (You may do this analysis using data from the 2000 decennial census - as in the book - as the first time period in the analysis OR you may use 2009 ACS 5-year estimates. You may use ACS data of a recent year in place of 2017 if you wish).

Prepare an accessible (non-technical) R markdown document with the following deliverables. Provide a brief motivation at the beginning, annotate each visualization appropriately, and then provide brief policy-relevant conclusions. Please do not suppress code blocks - but you should fold them to make your assignment more legible. Here are the specific deliverables:

Show your data wrangling work.

  1. Four small-multiple (2000 /2009 & 2017+) visualizations comparing four selected Census variables across time and space (TOD vs. non-TOD).
  2. One grouped bar plot making these same comparisons.
  3. One table making these same comparisons.
  4. Create two graduated symbol maps of population and rent within 0.5 mile of each transit station. Google for more information, but a graduate symbol map represents quantities for each transit station proportionally.
  5. Create a geom_line plot that shows mean rent as a function of distance to subway stations (Figure 1.17). To do this you will need to use the multipleRingBuffer function found in the functions.R script.
Show the code
library(tidyverse)
library(tidycensus)
library(sf)
library(ggthemr)
library(kableExtra)
library(tmap)
library(janitor)
library(sfdep)
library(arcpullr)

source("https://raw.githubusercontent.com/urbanSpatial/Public-Policy-Analytics-Landing/master/functions.r")

options(tigris_use_cache = TRUE, scipen = 999)
tmap_mode('view')
ggthemr('flat')

state <- "CA"
county <- "San Francisco"
crs <- 'EPSG:7132' # NAD 1983 us feet for san fran
acs_vars <- c("B25026_001E",
              "B02001_002E",
              "B15001_050E",
              "B15001_009E",
              "B19013_001E", 
              "B25058_001E",
              "B06012_002E")

palette5 <- c("#f0f9e8","#bae4bc","#7bccc4","#43a2ca","#0868ac")
Show the code
suppressMessages(
tracts10 <- get_acs(geography = "tract",
                    variables = acs_vars, 
                    year=2010, 
                    state=state,
                    county=county, 
                    geometry=TRUE,
                    output = "wide") %>% 
          st_transform(crs = crs) %>%
          rename(TotalPop = B25026_001E, 
                 Whites = B02001_002E,
                 FemaleBachelors = B15001_050E, 
                 MaleBachelors = B15001_009E,
                 MedHHInc = B19013_001E, 
                 MedRent = B25058_001E,
                 TotalPoverty = B06012_002E) %>%
          mutate(pctWhite = ifelse(TotalPop > 0, Whites / TotalPop, 0),
                 pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop), 0),
                 pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
                 year = "2010") %>%
          dplyr::select(-Whites, -FemaleBachelors, -MaleBachelors, -TotalPoverty, -ends_with("M")) %>%
          filter(!st_is_empty(geometry)) %>% ## there's a tract with an empty geom; drop it
          mutate(nb = as.character(st_contiguity(geometry))) %>%
          filter(nb != 0) #filter out tracts not on the mainland (i.e., w no contiguous neighbors)
)

suppressMessages(
tracts20 <- get_acs(geography = "tract",
                    variables = acs_vars, 
                    year=2020, 
                    state=state,
                    county=county, 
                    geometry=TRUE,
                    output = "wide") %>% 
          st_transform(crs = crs) %>%
          rename(TotalPop = B25026_001E, 
                 Whites = B02001_002E,
                 FemaleBachelors = B15001_050E, 
                 MaleBachelors = B15001_009E,
                 MedHHInc = B19013_001E, 
                 MedRent = B25058_001E,
                 TotalPoverty = B06012_002E) %>%
          mutate(pctWhite = ifelse(TotalPop > 0, Whites / TotalPop, 0),
                 pctBachelors = ifelse(TotalPop > 0, ((FemaleBachelors + MaleBachelors) / TotalPop), 0),
                 pctPoverty = ifelse(TotalPop > 0, TotalPoverty / TotalPop, 0),
                 year = "2020") %>%
          dplyr::select(-Whites, -FemaleBachelors, -MaleBachelors, -TotalPoverty, -ends_with("M")) %>%
          filter(!st_is_empty(geometry)) %>% ## there's a tract with an empty geom; drop it
          mutate(nb = as.character(st_contiguity(geometry))) %>%
          filter(nb != 0) #filter out tracts not on the mainland (i.e., w no contiguous neighbors)
)

allTracts <- rbind(tracts10, tracts20)
Show the code
tmap_mode('view')

tm_shape(allTracts) +
  tm_polygons(col = "TotalPop", alpha = 0.7, border.alpha = 0, style = "quantile", palette = "magma") +
  tm_layout(frame = FALSE) +
  tm_facets(by = "year", nrow = 2)
Show the code
tm_shape(sfmta_rail_stops_buffer_union) +
  tm_polygons(col = "lightblue", alpha = 0.5, border.alpha = 0) +
tm_shape(sfmta_rail_routes) +
  tm_lines() +
tm_shape(sfmta_rail_stops) +
  tm_dots() + 
  tm_scale_bar(position=c("left", "bottom"))

There must be a better way to do this filtering

Show the code
suppressWarnings(
selectCentroids <- st_centroid(allTracts)[sfmta_rail_stops_buffer_union, ] %>%
                    st_drop_geometry() %>%
                    left_join(., dplyr::select(allTracts, GEOID), by = "GEOID") %>%
                    distinct(GEOID, year, .keep_all = TRUE) %>%
                    st_sf() %>%
                    mutate(TOD = "TOD")
)

suppressWarnings(
  antiSelectCentroids <- st_centroid(allTracts)[sfmta_rail_stops_buffer_union, op = st_disjoint] %>%
                        st_drop_geometry() %>%
                        left_join(., dplyr::select(allTracts, GEOID), by = "GEOID") %>% 
                        distinct(GEOID, year, .keep_all = TRUE) %>%
                        st_sf() %>%
                        mutate(TOD = "Non-TOD")
)

tractsGroup <- rbind(selectCentroids, antiSelectCentroids) %>% mutate(medRentInf = ifelse(year == "2010", MedRent * 1.19, MedRent)) #per BLS: https://data.bls.gov/cgi-bin/cpicalc.pl?cost1=1&year1=201001&year2=202001
Show the code
tm_shape(tractsGroup) +
  tm_polygons(alpha = 0.4, border.col = "white", col = "TOD", palette = "magma") +
tm_facets(by = "year")
Show the code
# tm_shape(sfmta_rail_routes) +
#   tm_lines() +
# tm_shape(sfmta_rail_stops) +
#   tm_dots() + 
#   tm_scale_bar(position=c("left", "bottom"))

(Need to add captions, etc. here.)

Show the code
tm_shape(tractsGroup) +
  tm_polygons(alpha = 0.4, border.col = "white", col = "medRentInf", style = "quantile", palette = "magma") +
tm_facets(by = "year") +
tm_shape(sfmta_rail_routes) +
  tm_lines() +
tm_shape(sfmta_rail_stops) +
  tm_dots() + 
  tm_scale_bar(position=c("left", "bottom")) 
Show the code
allTracts.Summary <- 
  st_drop_geometry(tractsGroup) %>%
    group_by(year, TOD) %>%
    summarize(Rent = mean(MedRent, na.rm = T),
              Population = mean(TotalPop, na.rm = T),
              Percent_White = mean(pctWhite, na.rm = T),
              Percent_Bach = mean(pctBachelors, na.rm = T),
              Percent_Poverty = mean(pctPoverty, na.rm = T))

kable(allTracts.Summary) %>%
  kable_styling() %>%
  footnote(general_title = "\n",
           general = "Table 1.2")
year TOD Rent Population Percent_White Percent_Bach Percent_Poverty
2010 Non-TOD 1369.455 3958.687 0.5424141 0.0265217 0.1064100
2010 TOD 1289.979 3984.177 0.5337059 0.0240804 0.1432555
2020 Non-TOD 2039.393 3531.652 0.4843022 0.0234164 0.0981562
2020 TOD 1939.268 3571.000 0.4644013 0.0284417 0.1215223

Table 1.2
Show the code
allTracts.Summary %>%
  rename(
    pctBach = Percent_Bach,
    pctPov = Percent_Poverty,
    pctWht = Percent_White,
    pop = Population,
    rent = Rent
  ) %>%
  gather(Variable, Value, -year, -TOD) %>%
  ggplot(aes(year, Value, fill = TOD)) +
    geom_bar(stat = "identity", position = "dodge") +
    facet_wrap(~Variable, scales = "free", ncol=5) +
    scale_fill_manual(values = c("#bae4bc", "#0868ac")) +
    labs(title = "Indicator differences across time and space") +
    plotTheme() + 
  theme(legend.position="bottom",
        aspect.ratio = 1)

Buffers

Intersections

Spatial Clip

Spatial Intersections

Small Multiple of Three Operations

Indicator Maps

TOD Indicator Tables

TOD Indicator Plots

Examining Three Submarkets

Using the multipleRingBuffer() Function